-
Notifications
You must be signed in to change notification settings - Fork 843
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup uses of Array::data_ref (#3880) #3918
Conversation
13b9684
to
1ddbb77
Compare
@@ -1159,8 +1159,13 @@ pub struct LexicographicalComparator<'a> { | |||
impl LexicographicalComparator<'_> { | |||
/// lexicographically compare values at the wrapped columns with given indices. | |||
pub fn compare(&self, a_idx: usize, b_idx: usize) -> Ordering { | |||
for (data, comparator, sort_option) in &self.compare_items { | |||
match (data.is_valid(a_idx), data.is_valid(b_idx)) { | |||
for (nulls, comparator, sort_option) in &self.compare_items { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked at this seems to simply pull the match
up a level https://docs.rs/arrow-data/35.0.0/src/arrow_data/data/mod.rs.html#384-389 (as in there is no more dynamic dispatch from what I can see)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should run the benchmarks on this PR prior to merging
Overall though I think this is really nice progress
Some(nulls) => buffer_unary_not(nulls.buffer(), nulls.offset(), nulls.len()), | ||
let values = match input.nulls() { | ||
None => NullBuffer::new_null(input.len()).into_inner(), | ||
Some(nulls) => !nulls.inner(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these certainly do look very nice now 👌
The benchmarks do not show any major regressions, with most kernels completely unaffected. The boolean kernels seem to show a consistent 10% regression on some systems and a consistent 10% improvement on others, not entirely sure what is going on here but given we're talking 10s of nanoseconds I think it is probably just noise, and certainly not relevant to realistic workloads. |
Which issue does this PR close?
Part of #3880
Rationale for this change
The first of likely multiple PRs to gradually remove the direct usage of
ArrayData::data_ref
from the kernelsWhat changes are included in this PR?
Are there any user-facing changes?